In this survey of artificial intelligence research, the substantive focus is
heuristic programming, problem solving, and closely associated learning models.

The focus in time is the period 1963-1968. Brief tours are made over a variety
of topics: generality, integrated robots, game playing, theorem proving, semantic
information processing, etc.

One program, which employs the heuristic search paradigm to generate explanatory
hypotheses in the analysis of mass spectra of organic molecules, is described in
some detail. The problem of representation for problem solving systems is discussed.
Various centers of excellence in the artificial intelligence research area are
mentioned. A bibliography of references is given.

ARTIFICIAL INTELLIGENCE: THEMES IN THE SECOND DECADE


by  Edward  A.  Feigenbaum 


The  purpose  of  this  talk  is  to  survey  recent  literature  in  artificial 
intelligence research, and to delineate and assess trends in the research.
For an infant field of research that has been growing as rapidly as this
one has, with emphasis on pragmatics and techniques, without benefit of
much theoretical underpinning, both the delineation and assessment present
problems.

The  most  memorable  scientific  talk  I  ever  attended  was  delivered 
entirely  impromptu  to  an  informal  Stanford  group  by  my  colleague 
Professor  Joshua  Lederberg.  The  talk  ranged  over  research  in  what  might 
be  called  "RNA  and  DNA  information  processing".  Though  his  interests 
range  broadly,  the  ground  he  covered  that  day  was  clearly  his  own  ground-- 
a territory in which he has few peers.

Like  a  double  helix,  his  talk  had  two  intertwined  strands.  One 
strand  carried  the  basic  information  on  what  experiments  had  been  carried 
out and the empirical findings. The other strand consisted of Lederberg's
personal scientific assessment of the quality of individual experiments
and the value of the results; of judgments as to the potential fruitfulness
of pursuing certain lines of endeavor, and the likely unfruitfulness
of pursuing others; of an assessment of core issues needing resolution
vs. issues that were merely interesting but peripheral; and of many other
threads of an evaluative nature. In general, this strand of the talk
consisted of comments covering a broad spectrum, from those for which
there was a strong scientific justification ("almost proven") to those
based on the subjective and intuitive feelings that long experience in
a field is supposed to give ("I have a hunch that ... "). In sum, the
listener was left with a mental map of the problem-experiment-theory maze
that constituted the current state of this area of molecular biology
research, with values for present status and futures associated with the
alternate paths through the maze.

This double-stranded approach is the model I have taken for what a
survey  should  attempt.  It  should  be  something  other  than  a  comprehensive 
set  of  pointers  into  the  literature.  Careful  selection  based  on  sometimes 
personal  criteria  of  relevance  and  importance  is  essential;  evaluations 
based  on  sometimes  subjective  criteria  of  plausibility  and  potential  are 
useful. 

This is a talk, not a book, so I cannot survey all the areas that
have  a  rightful  place  under  the  umbrella  of  "artificial  intelligence 
research".  My  choice  of  topic  headings  is  not  intended  as  a  definition 
of  "artificial  intelligence"  by  implication.  Vigorous  subareas,  with 
their  own  scientific  "culture"  and  established  publishing  patterns,  were 
left to fend for themselves. Thus, for example, the strong subarea that
calls itself "pattern recognition research" was not surveyed, nor were the
linguistics-translation subarea, bionics, neurophysiological information
processing models, and others.

The  focus  of  this  talk  is  heuristic  programming,  problem  solving, 
and  closely  associated  learning  models.  Within  the  beam  of  this  spotlight, 
I  will  concentrate  on  research  of  the  period  1963-68,  since  I  feel  that 
the book Computers and Thought (21) is already an adequate reference work
for the 1956-62 period. As a practical measure, I will use the abbreviation
"A.I." for "artificial intelligence".


Some Global Characteristics of the A.I. Research Endeavor

Of  prime  interest  is  the  explosion  of  problems  attacked,  projects 
established,  and  reports  published  in  the  past  five  years.  In  spite  of 
this  rapid  growth,  quality  has  been  maintained  at  a  reasonably  high  level, 
in  my  opinion.* 

From  the  very  beginning  of  the  A. I.  research  at  Carnegie  Tech  in 
1955-56  (which  I  regard  as  The  Beginning  for  all  practical  purposes), 
Newell  and  Simon  called  their  research  "Complex  Information  Processing". 
They still do, though many projects have been born since as "artificial
intelligence  projects".  In  this,  Newell  and  Simon  are  to  be  credited  with 
considerable  foresight.  For  A. I.  research  is  becoming  ever  more  enmeshed 
at  its  periphery  with  other  areas  of  computer  science  research  and 
application  that  can  well  be  described  as  "complex  information  processing". 
For  example,  is  the  research  on  intelligent  question-answering  programs 
still  to  be  regarded  as  A. I.  research,  or  is  it  the  natural  direction  for 
progress in the field called information retrieval research? Is the
effort to develop a problem solving program to write computer operating
systems (24) an A.I. effort or is it research in systems programming?
Is a program (22) that forms chemical hypotheses in the analysis of mass
spectra of organic molecules a piece of A.I. research, or is it chemistry?

* Some observers have commented upon a dip in productivity in the period
1960-63, and this appears to be documentable. I believe that this was
attributable to: a shift of emphasis at some of the major centers toward
technological problems of tool building; a much-needed reassessment of
the implications and significance of efforts of the late 1950's; a
substantive shift of attention to problems for which a long gestation time
was needed (e.g. natural language analysis, integrated robots,
representation); and the establishment of academic computer science departments,
programs, curricula, etc., which absorbed a significant portion of the
energies of the available talent. Each of these was to have its eventual
payoff in the productive 1963-68 period.

These questions are not as trivial as the obvious "Who cares?" answer
would make them seem. There is a general tendency in virtually all lines
of scientific endeavor for research disciplines to fragment into
specialities as the early and difficult problems become better understood and
as practitioners move into the discipline to make use of the results.
"Successes" are then attributed to the specialities, ignoring the
contributions from the spawning discipline.

In A.I. research, examples of this process at work are numerous.
Consider character recognition (i.e. what those "optical readers" do).
Much of the early work on the pattern recognition problem focused on
character recognition as an interesting initial task. This research,
circa 1955, was motivated more by the question, "What are the interesting
kinds of behavior that a computer might be made to perform, in contrast
to the mundane tasks of the day (such as calculating function tables
and payrolls)?" than by the question, "How can we make a machine read
characters of the alphabet reliably?" (62) The pursuit of the more
general question inspired the early work on problem solving programs.
Eventually the line of research turned into the applied art of designing
character recognition machines and thereby, for all practical purposes,
passed out of the field of A.I. research. List processing provides another
example. As a body of techniques it was invented (as far as I can
determine) by Newell, Shaw, and Simon for handling the complex problems of
memory allocation and hierarchical (and recursive) control of processing
in the Logic Theory program and the earliest version of the General
Problem Solver. List processing received additional refinement by
another A.I. researcher, McCarthy (LISP). It underwent further change
(threaded lists, knotted lists, symmetric lists, etc.) as it made the
transition from "something those A.I. researchers are doing" to "software
system techniques". By now list processing is an every-day working tool
of a number of specialty areas, particularly compiler and operating system
implementation.

Every  discipline  thrives  on  its  successes,  particularly  in  terms  of 
attracting  talented  individuals  and  research  support  funds.  There  is  a 
danger  that  the  A. I.  area,  as  the  residual  claimant  for  the  problems  not 
yet  solved,  the  problems  not  yet  well  understood,  the  problems  for  which 
failure was the reward for the initial foray, will come to be viewed as
the "no win" area of computer science and the home of the "pie-in-the-sky
guys". There is a scattering of evidence that such a process is
already  at  work,  and  I  regard  this  as  most  unfortunate  and  undeserved. 

Finally,  the  rapid  growth  of  A. I.  research  has  made  the  "Invisible 
College"  of  the  area  less  viable.  There  is  now  a  Special  Interest  Group 
within  the  ACM  for  Artificial  Intelligence;  a  new  journal  is  being 
prepared;  and  an  international  conference  is  being  organized. 

The  Search  for  Generality 

This is the title of the Newell-Ernst IFIP65 paper, one that deserves
rereading (48). Others have since joined the quest. There appear to be
two roads: the high road and the low road.
Those who walk the high road seek a generality of the total problem
solving system that will allow a core of problem solving methods that are
not task-specific to discover solutions in a wide variety of problem
domains. Here the problem of the internal representation in terms of
which the core of methods will operate is crucial. If the representation
is made general enough so that each new application does not have to be
tortured to fit into it, can the methods and associated processes be made
general enough to cope with it without a consequent loss of problem
solving power? We lack a good understanding yet of this problem of
generality and representation. A view of existing problem solving programs
would suggest, as common sense would also, that there is a kind of "law
of nature" operating that relates problem solving generality (breadth of
applicability) inversely to power (solution successes, efficiency, etc.),
and power directly to specificity (task-specific information). We do
not now know how to write problem solvers that will accept problems in
a rather general representation at the start but then alter the
representation systematically toward greater specificity and power as more
problem-specific information becomes available during the problem solving.
An  example  of  this  process  has  been  worked  out  in  detail  (1» ) ,  but 
mechanization  is  not  in  view. 

The  General  Problem  Solver  traveled  the  high  road  alone  for  nearly 
a  decade  and  established  the  search  for  generality  as  a  viable  research 
path. Ernst and Newell's new monograph (17), exploring the successes and
problems encountered in applying GPS to a variety of task environments,
appears to signal the end of the first phase of the GPS trek. More
recently, GPS has acquired a life of its own, independent of its creators.
For example, the GPS paradigm has appeared in a slightly more generalized
form as a program called FORTRAN Deductive System (54); and considerably
transfigured as the Graph Traverser (15, 16). Another GPS variant has
just emerged in Sweden (61).

Travelers  on  the  low  road  seek  not  a  general  problem  solving  system 
but  theorems  and  generalizations  of  technique  concerning  underlying 
mechanisms that are common to a class of problem solving programs. Perhaps
the  best  example  involves  the  heuristic  search  paradigm. 

As it was a decade ago, the central paradigm of A.I. research is
heuristic search. A tree of "tries" (aliases: subproblems, reductions,
candidates, solution attempts, alternatives-and-consequences, etc.) is
sprouted (or sproutable) by a generator. Solutions (variously defined)
exist at particular (unknown) depths along particular (unknown) paths.
To find one is a "problem". For any task regarded as nontrivial, the
search space is very large. Rules and procedures called heuristics are
applied to direct search, to limit search, to constrain the sprouting of
the tree, etc.
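
The paradigm just described can be illustrated with a minimal sketch. It is
not taken from any program cited in this survey: the toy task (reach a target
number from 1 using "add one" and "double"), the best-first ordering, and all
names such as heuristic_search, moves, and TARGET are assumptions chosen for
illustration only. A generator sprouts the tree of tries, one heuristic
directs the search (expand the most promising try first), and another limits
it (never sprout a try that overshoots the target).

```python
import heapq
from itertools import count


def heuristic_search(start, generator, is_solution, heuristic, max_nodes=10_000):
    """Best-first search: always sprout the most promising unexpanded try."""
    tie = count()  # tie-breaker so the heap never has to compare states
    frontier = [(heuristic(start), next(tie), start, [start])]
    seen = {start}
    expanded = 0
    while frontier and expanded < max_nodes:
        _, _, state, path = heapq.heappop(frontier)
        if is_solution(state):
            return path
        expanded += 1
        for child in generator(state):  # the generator sprouts the tree
            if child not in seen:
                seen.add(child)
                heapq.heappush(
                    frontier, (heuristic(child), next(tie), child, path + [child])
                )
    return None  # search space exhausted or effort limit reached


TARGET = 37  # hypothetical toy problem, chosen only for illustration


def moves(n):
    # A heuristic that limits search: tries that overshoot are never sprouted.
    return [c for c in (n + 1, n * 2) if c <= TARGET]


# A heuristic that directs search: prefer tries closer to the target.
path = heuristic_search(1, moves, lambda n: n == TARGET, lambda n: TARGET - n)
print(path)
```

The search procedure itself knows nothing about the task; everything
task-specific enters through the generator, the solution test, and the
heuristic, which is the separation the next paragraph relies on.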

While  some  of  this  tree-searching  machinery  is  entirely  task-specific, 
other  parts  can  be  made  quite  general  over  the  domain  of  designs  employing 
the  heuristic  search  paradigm.  The  so-called  "alpha-beta"  procedure  is 
a  classical  example  (70,  60).  Its  employment  is  "obvious"  if  one  is 
careful  and  thoughtful  about  search  organization.  It  was  employed  as 
early as 1958 in the Newell-Shaw-Simon chess program, it being so much
a  part  of  the  underlying  machinery  that  the  employers  did  not  consider 
it  worthy  of  bringing  to  the  attention  of  others. 
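
For readers who have not met the procedure, the following is a minimal
textbook-style sketch of alpha-beta applied to a small game tree. It is not
the code of any program cited here; the nested-list tree representation and
the visited argument (which merely records which leaves were examined) are
assumptions made for illustration.

```python
def alphabeta(node, alpha=float("-inf"), beta=float("inf"),
              maximizing=True, visited=None):
    """Minimax value of a nested-list game tree, skipping branches
    that cannot affect the result."""
    if not isinstance(node, list):  # a leaf: its static evaluation
        if visited is not None:
            visited.append(node)
        return node
    if maximizing:
        best = float("-inf")
        for child in node:
            best = max(best, alphabeta(child, alpha, beta, False, visited))
            alpha = max(alpha, best)
            if alpha >= beta:  # the opponent will never allow this line
                break
        return best
    best = float("inf")
    for child in node:
        best = min(best, alphabeta(child, alpha, beta, True, visited))
        beta = min(beta, best)
        if alpha >= beta:  # we already have a better alternative elsewhere
            break
    return best


tree = [[3, 5], [2, 9]]  # the maximizer moves first, then the minimizer
seen = []
value = alphabeta(tree, visited=seen)
print(value, seen)  # prints: 3 [3, 5, 2]
```

In this example the leaf 9 is never examined: once the second branch is known
to be worth at most 2, it cannot beat the 3 already secured in the first
branch. Nothing in the procedure refers to chess or checkers, which is why it
can be stated once for the whole class of game-tree searchers.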
But what is obvious to some is not obvious to others. Each new
program designer, flirting with the heuristic search paradigm, should
not be forced to reinvent (or, what is worse, pass over in ignorance)
generally applicable search-facilitating procedures, particularly if the
procedures are subtle.

A  small,  but  growing,  body  of  knowledge  of  this  type  has  emerged. 

The  work  of  Slagle  is  noteworthy,  particularly  his  MULTIPLE  (7^'>  which 
is  in  effect  a  "system"  of  such  techniques.  Other  papers  of  interest 
are those of Nilsson ("minimum cost paths") (50), Floyd ("nondeterministic
algorithms") (23), and Golomb and Baumert ("backtrack programming") (25).

In  a  sense  the  travelers  on  the  low  road  are  tool  builders,  but  their 
tool  building  is  often  of  an  abstract  and  elegant  sort. 

Integrated  Robots 

History  will  record  that  in  1968,  in  three  major  laboratories  for 
A. I.  research,  an  integrated  robot  consisted  of  the  following: 

a.  a  complex  receptor  (typically  a  television  camera  of 
some  sort)  sending  afferent  signals  to  . . . 

b.  a  computer  of  considerable  power;  a  large  core  memory; 
a  variety  of  programs  for  analyzing  the  afferent  video 
signals  and  making  decisions  relating  to  the  effectual 
movement  of  . . . 

c.  a  mechanical  arm-and-hand  manipulator  or  a  motor-driven 
cart. 

The intensive effort being invested in the development of computer
controlled hand-eye and eye-cart devices is for me the most unexpected
occurrence in A.I. research in the 1963-68 period.

Research  of  this  type  began  effectively  with  Ernst's  thesis  on  a 
computer controlled mechanical hand (MH-1) (18). He wrote interesting
heuristic programs for solving problems of manual manipulation in a
real environment. MH-1 was almost totally "blind", but it did store a
symbolic  internal  representation  of  the  external  situation  (a  "model") 
in  terms  of  which  it  did  its  problem  solving.  The  seminal  piece  of 
research  for  the  visual  ("eye")  processing  was  the  oft-cited  thesis  of 
Roberts  (58)  on  the  three-dimensional  perception  of  solids  from  two- 
dimensional  picture  input. 

The three current robot projects are direct descendants. They are:
the Stanford Hand-Eye Project (McCarthy, et al.), the MIT Hand-Eye
Project (Minsky and Papert), and the Stanford Research Institute's Robot
Project (Nilsson, Raphael, Rosen, et al.).

Not  much  information  about  these  projects  has  been  published.  Hence, 
what  follows  is  to  some  extent  anecdotal. 

As  one  might  expect,  the  design,  implementation,  and  use  of  the 
robot  hardware  presents  some  difficult,  and  often  expensive,  engineering 
and maintenance problems. If one is to work in this area, solving such
problems is a necessary prelude but, more often than not, unrewarding
because the activity does not address the questions of A.I. research that
motivate  the  project.  Why,  then,  build  devices?  Why  not  simulate  them 
and  their  environments?  In  fact,  the  SRI  group  has  done  good  work  in 
simulating  a  version  of  their  robot  in  a  simplified  environment.  (A 
film  of  this  is  available.)  So  it  can  be  done  and  the  questions  raised 
above  are  relevant. 
The answer given is as follows. It is felt by the SRI group that
the most unsatisfactory part of their simulation effort was the simulation
of the environment. Yet, they say that 90% of the effort of the simulation
team went into this part of the simulation. It turned out to be very
difficult  to  reproduce  in  an  internal  representation  for  a  computer  the 
necessary  richness  of  environment  that  would  give  rise  to  interesting 
behavior  by  the  highly  adaptive  robot.  It  is  easier  and  cheaper  to  build 
a  hardware  robot  to  extract  what  information  it  needs  from  the  real  world 
than  to  organize  and  store  a  useful  model.  Crudely  put,  the  SRI  group's 
argument  is  that  the  most  economic  and  efficient  store  of  information 
about  the  real  world  is  the  real  world  itself. 

The  task  of  building  an  integrated  robot  is  one,  I  believe,  that 
contains the possibility of studying some problems of major interest in
artificial intelligence research, among which are: strategy formation and
planning; the problem of representing situations for problem solving
processes and subsequent modification of representations as new information
becomes available; and visual perceptual processing. Of the three groups,
only the SRI group has published papers discussing the more general aspects
and goals of this research (59, 57).

Both  the  MIT  and  Stanford  University  groups  have  worked  on  programs 
for  controlling  a  variety  of  arm-hand  manipulators,  from  the  very  simple 
to  the  very  complex,  from  the  anthropomorphic  variety  to  the  very  non- 
anthropomorphic.  None  of  the  more  esoteric  manipulators  seem  to  have 
worked  out  very  well,  though  there  is  no  published  documentation  of 
successes,  failures,  and  reasons. 
Visual scene analysis programs are important in all of these projects.
Most of the programming effort is being invested to build the proper tools
and techniques to gain control of this piece of the task. The scene
analysis problem is this: the TV image of a scene (digitized) is
available to be read into the computer memory; scan and process it as necessary
to produce a symbolic description of the objects in the scene and their
various interrelationships. Guzman (29) at MIT attacked the scene analysis
task in a somewhat abstracted form (no TV camera, a no-noise symbolic
simulation of the input scene) with striking success. Guzman's work,
incidentally, should be of interest to psychologists working on models
of human visual perception processes.

The name of the game, however, is integrated behavior on a problem.
Both the MIT and Stanford University hand-eye-computer aggregates have
performed a few types of block-finding and block-stacking behaviors. A
paper in the IFIP68 Proceedings describes the Stanford work (53); I have
not found a paper describing the MIT block-stacking activity.

Do you want to build an integrated robot? Wait! The three lively
groups, whose levels of talent and funding are hard to match, have not
yet uncovered all the first-level problems. These will be found, reported,
and assessed, perhaps within the next two years. The projects are still
very much in the tool-building and debugging stage. Whether the integrated
robot is a useful and appropriate task for making progress on the general
problems of A.I. research remains to be proven.


Theorem  Proving 


Since  Robinson  is  presenting  an  invited  survey  paper  on  automatic 
theorem  proving  at  this  conference,  it  would  be  inappropriate  for  me  to 
survey  this  literature  here.  But  perhaps  a  few  comments  and  a  few 
citations  are  in  order. 

Many in the field persist in calling their theorem proving programs
deductive programs (thus, the aforementioned FORTRAN Deductive System,
DEDUCOM (69), the term "deductive question-answering programs"; Hunt's
survey (32) contains numerous instances of this misuse). This is a
terminological mistake. If one looks carefully at how these programs
find their proofs, much more than deduction in the strict sense is involved.
When a theorem proving program applies a rule of inference, say modus ponens,
in taking a trial step forward, it is clearly making an elementary deduction.
But the search for a proof is not a deduction. In practice, the
terminological error has tended to inhibit clear thinking on key problems, for
example those involved in the attempt to unify parts of the field (like
heuristic search) by a careful examination of the basic mechanisms used
in a variety of successful programs. The usage I am vilifying is not
harmless because it tends to sort the work of the field into the wrong
categories.

I prefer Amarel's term, "problems of derivation type", as correct,
clear, and meaningful. I feel the term "discovery processes" is an
appropriate and useful one for describing the processes by which the
proof of a theorem (or the move from a chess position, or the chemical
hypothesis that explains a mass spectrum, etc.) is found.
Robinson's resolution method for theorem proving in the predicate
calculus has received much attention in the past few years. Unfortunately,
this has been accompanied by a sentiment that the resolution method
"liberates" one from the guessy, messy, chaotic world of heuristic search.
Again a false distinction, based on unclear understanding either of
resolution or of existing heuristic search proof-finding programs or
both, sorts the world along the wrong lines. The resolution method does
provide a systematic formal mechanism for guaranteeing the completeness
of the space of proof candidates, but it does not by itself deal with the
well-known and inevitable problem of the proliferation of problem states
(53a). Thus search strategies have been overlaid to bring about effective
problem solving using resolution (e.g. "unit preference", "set of support").
The net result is that the processes these programs carry out are much the
same as (in some cases, identical to) those carried out by the heuristic
search proof-finding programs that are thought to be so different (17). I
predict that much more will be heard on this issue in the second decade.
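
The point that resolution supplies the space of candidates while a strategy
must still order the search can be made concrete with a small sketch. It is
an illustration under loud assumptions, not any cited program: it handles
only the propositional case (no variables, hence no unification, unlike the
predicate calculus discussed above), it represents a clause as a set of
string literals with "~" marking negation, and its ordering rule (try pairs
containing the shortest clause first) is only a crude stand-in for unit
preference. The names negate, resolvents, and refute are hypothetical.

```python
from itertools import combinations


def negate(lit):
    """'P' <-> '~P'."""
    return lit[1:] if lit.startswith("~") else "~" + lit


def resolvents(c1, c2):
    """All clauses obtainable by resolving c1 with c2 on one literal."""
    return [
        frozenset((c1 - {lit}) | (c2 - {negate(lit)}))
        for lit in c1
        if negate(lit) in c2
    ]


def refute(clauses, max_steps=1000):
    """Return True if the empty clause is derived (the set is unsatisfiable),
    False if nothing new can be derived, None if the effort limit is hit."""
    known = {frozenset(c) for c in clauses}
    for _ in range(max_steps):
        # The heuristic overlay: pairs involving short (ideally unit)
        # clauses are tried before all others.
        pairs = sorted(combinations(known, 2),
                       key=lambda p: min(len(p[0]), len(p[1])))
        new = None
        for c1, c2 in pairs:
            for r in resolvents(c1, c2):
                if not r:
                    return True  # empty clause: contradiction found
                if r not in known:
                    new = r
                    break
            if new is not None:
                break
        if new is None:
            return False  # saturated without contradiction
        known.add(new)
    return None


# Premises P, P -> Q, Q -> R, together with the negated goal ~R.
print(refute([{"P"}, {"~P", "Q"}, {"~Q", "R"}, {"~R"}]))  # prints: True
```

Removing the sort leaves the procedure just as complete but lets it wander
among long clauses; the ordering is exactly the kind of search guidance that
the heuristic programs supply under other names.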

In  1959,  McCarthy  proposed  a  class  of  programs  (advice-takers)  that 
would  reason  about  the  world  in  a  common-sense  way.  The  "world"  was  to 
be  given  a  homogeneous  representation  in  the  predicate  calculus.  Problems 
which  presented  themselves  would  be  solved  as  proofs  over  the  space  of 
expressions. Recently, Green and Raphael (26) have incorporated the
machinery of the resolution method as the proof-finding "engine" in an
advice-taker-like question-answering and fact retrieval program. The
idea is important (and works), but as just mentioned the necessary heuristic
overlay to guide the search effort will have to be provided if the system
is to be useful and practical.


Game Playing Programs


In the first decade, there were those who wrote chess playing
programs because chess provided an interesting and complex task
environment in which to study problem solving processes (the capstone of this
line of research is a gem of a paper by Newell and Simon (49) examining
in great detail an example of human chess play in the light of what we have
come to understand about problem solving processes in chess from building
chess playing programs). There were others who wrote chess programs
because the activity presented such a challenge: chess is THE great
centuries-old human intellectual diversion.

Such a group is Greenblatt, et al. They have written the first
program that plays very good (but not yet expert) chess. I have seen only
one paper on the Greenblatt program (27). Along with a brief description,
it gives some examples of the program's play. Competing against humans
under ordinary tournament rules, it is said to have won a Class D
tournament in Boston in mid-1967, reportedly beating a Class C player in the
process. It is also reported to be much better by now. Apparently, its
most celebrated victory was a handy win over Hubert Dreyfus.

Why does it play so well as compared with previous chess programs?
I do not know anyone who yet has a convincing answer to this (partially
because of the paucity of information about the program). As I view it,
the program embodies no fundamentally new ideas about how to organize
chess playing programs. It contains much more specific information about
chess play than any previous program had. Computer time, top programmer
talent, fancy tools of the second decade (CRT displays, interactive access,
big core memory) -- the patient has received an order of magnitude more
loving care than any other previous patient (all other patients, you
will remember, were released in a weak condition; most died). Finally,
an excellent "learning loop" is available -- through a human at the console.
Blunders are analyzed, and quickly fixed with patches and/or new chess
knowledge. The system-wide effects of the patch or the new knowledge, if
any, can be fairly quickly detected, and revised if causing problems.
It is a feasible way to improve (or educate) a program, and a useful one
if you are interested in a high level of performance in a specific task
rather than in general models for the organization of problem solving.
I foresee this technique, albeit in more sophisticated forms, being widely
adopted in the second decade.

The first international tournament between chess playing programs
was won by a program developed in the Soviet Union at the Institute for
Theoretical and Applied Physics in Moscow (by Adelson-Velskii, et al.;
no  writeup  available).  The  loser  was  the  old  MIT  program,  developed  by 
Kotok  (37)  and  slightly  modified  by  McCarthy's  group  at  Stanford.  The 
play  is  available  for  inspection  in  the  SICART  Bulletin  (56).  Neither 
program  played  well  when  compared  with  the  average  level  of  performance 
of  the  Greenblatt  program. 

Samuel's  well-publicized  checker  playing  program  has  undergone 
extensive  revision  (60),  and  now  stands  near  the  top  of  its  profession. 

The  major  revisions  are  in  the  area  of  the  learning  routines,  and  will 
be  discussed  later. 

Williams (7*0 has attacked the problem of modeling the generality
with which human players approach common board and card games. His
program, General Game Playing Program, is given as input a description
of the objects used in playing the game; and the rules of the game taken
from Hoyle's Rules of Games, transformed straightforwardly into a Hoyle-like
input language. The program will then play at least a legal game,
for most of the games described in Hoyle.

Machine  Learning  (Specifically,  Internal  Mechanisms;  not  Human-Directed) 

The A.I. field still has little grasp of the machine learning problem
for problem solvers. For many years, almost the only citation worth making
was to Samuel's famed checker playing program and its learning system.
(Great interest arose once in a scheme proposed by Newell, Shaw, and
Simon for learning in GPS, but the scheme was never realized.) Surprisingly,
today we face the same situation.

Samuel's  new  paper  (60)  describes  a  major  revision  of  the  position 
evaluation scheme of the checker player and its attendant learning
processes. Evaluation using the linear polynomial function is abandoned
in favor of complex nonlinear evaluation processes. The features of positions,
which used to be represented as terms in the polynomial, are now grouped
according to their (perceived) interdependencies. "Crude" values for
the features are used, e.g. 3-, 5-, or 7-valued functions. (This is an
old idea, whose rationale was given as far back as 1951 by Simon for
chess-playing; it is used in the Newell-Shaw-Simon Chess Playing Program.)
Vectors of feature values are used to enter "signature tables" at the
first level. A table-lookup takes place; the resulting table value is
quantized into "crude" states (five-valued) and passed up to a second level
of "signature table" aggregation (over a set of first level signature
tables). This process is repeated once again at a third level of
aggregation, and from this "top" signature table a final score for the
evaluation  emerges.  Samuel  shows  the  considerable  performance  advantages 
of  this  hierarchical  scheme  over  the  (already  quite  successful)  linear 
polynomial  evaluation. 
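The three-level lookup described above can be illustrated with a small sketch. It is only a toy under stated assumptions: the grouping of eight 3-valued features into pairs, the table sizes, and the table entries (a rescaled sum standing in for learned values) are all hypothetical and are not taken from Samuel's paper. Only the control flow (lookup, quantize to five crude states, pass up a level) follows the description.

```python
from itertools import product


def quantize(value, levels):
    """Map a value in [0, 1] onto one of `levels` crude integer states."""
    return min(levels - 1, int(value * levels))


def make_table(arity, levels):
    """Build a signature table over every tuple of crude input values.

    The entries are a hypothetical stand-in (a rescaled sum); in a
    learning program they would be adjusted from experience.
    """
    max_sum = arity * (levels - 1)
    return {key: sum(key) / max_sum
            for key in product(range(levels), repeat=arity)}


# Assumed layout: 8 three-valued features -> 4 first-level tables
# -> 2 second-level tables -> 1 top table.
LEVEL1 = [make_table(2, 3) for _ in range(4)]
LEVEL2 = [make_table(2, 5) for _ in range(2)]
TOP = make_table(2, 5)


def evaluate(features):
    """Score a position given 8 crude feature values, each in {0, 1, 2}."""
    first = [quantize(LEVEL1[i][tuple(features[2 * i:2 * i + 2])], 5)
             for i in range(4)]
    second = [quantize(LEVEL2[i][tuple(first[2 * i:2 * i + 2])], 5)
              for i in range(2)]
    return TOP[tuple(second)]
```

Because each table is indexed by a whole vector of feature values, interactions among the features in a group can be captured directly, which a sum of independent polynomial terms cannot do.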

Samuel  titles  one  of  his  sections,  "The  Heuristic  Search  for  Heuristics", 
and  maintains  therein  "that  the  task  of  making  decisions  as  to  the  heuristics 
to be used is also a problem which can only be attacked by heuristic
procedures, since it is essentially an even more complicated task than is the
playing itself." An interesting case study of the learning of heuristics
by a heuristic program is emerging in the dissertation research of Waterman
at Stanford (72), nearly complete. The task environment is draw poker.
The heuristics are not cast as programs in the traditional mode, but are
"brought  to  the  surface"  as  situation-action  rules  in  a  "production"  list. 
Initially  the  list  contains  only  the  null  rule:  whatever  the  situation, 
play  randomly.  Basically,  four  things  can  happen  to  the  table  of  rules. 
Situation-action  rules  can  be  added.  The  order  of  the  rules  can  be  altered 
(since  the  table  is  scanned  in  a  top-to-bottom  fixed  order,  this  can  make 
a  big  difference  in  behavior).  A  rule  can  be  "generalized"  by  altering 
its  situation-side  so  as  to  ignore  certain  dimensions  of  the  game 
situation,  thereby  causing  it  to  "catch"  more  situations.  Or  the 
situation-side  can  be  "specialized"  so  as  to  be  more  discriminating 
among  game  situations  and  hence  "catch"  fewer  situations.  This  learning 
scheme  works  well  in  a  variety  of  different  training  procedures. 
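A minimal sketch of such a rule table follows. The two-dimensional situation (hand strength, pot size), the action names, and the helper functions are hypothetical illustrations and are not drawn from Waterman's program; the sketch shows only the ordered scan and the generalize and specialize operations on the situation side.

```python
ANY = None  # wildcard: this dimension of the situation is ignored


def matches(pattern, situation):
    """A pattern catches a situation if every non-wildcard dimension agrees."""
    return all(p is ANY or p == s for p, s in zip(pattern, situation))


def decide(rules, situation):
    """Scan the table top to bottom and fire the first matching rule."""
    for pattern, action in rules:
        if matches(pattern, situation):
            return action
    raise LookupError("no rule matched; the null rule is missing")


def generalize(rule, dimension):
    """Ignore one dimension, so the rule catches more situations."""
    pattern, action = rule
    new_pattern = list(pattern)
    new_pattern[dimension] = ANY
    return tuple(new_pattern), action


def specialize(rule, dimension, value):
    """Constrain one dimension, so the rule catches fewer situations."""
    pattern, action = rule
    new_pattern = list(pattern)
    new_pattern[dimension] = value
    return tuple(new_pattern), action


# The table starts with only the null rule.
NULL_RULE = ((ANY, ANY), "play-randomly")
```

Adding a rule is `rules.insert(0, new_rule)`, and reordering is an ordinary list operation. Since the scan stops at the first match, moving the null rule to the top would mask every other rule, which is why order matters so much.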

A Note In Passing: Turning Inward to the Programming Task Itself


In  the  checker-playing  program,  Samuel  used  a  "rote  memory"  learning 
scheme,  in  which  many  checker  board  positions  were  stored  away  along  with 
their  scores  from  previous  look-ahead  search.  If  a  "memorized"  position 
were  encountered  in  a  new  analysis,  the  value  was  available  and  would  not 
have  to  be  recomputed.  Theorem  proving  programs  use  "rote  memory"  for 
analogous  purposes  in  building  their  theorem  memories.  So  do  many  other 
programs,  e.g.  Heuristic  DENDRAL,  described  later. 

The  issue  of  store  vs.  recompute  is  quite  general  and  classical.  One 
looks  forward  to  the  day  when  the  programming  system  one  is  using  is  smart 
enough  to  figure  out  when  it  should  assign  function  values  by  table  lookup 
in  a  rote  memory  it  builds  and  when  by  computation  (it  would  make  this 
decision by an analysis of the uses of and performance characteristics
of  the  function  as  it  encounters  this  information  during  execution).  A 
first  step  in  this  direction  has  recently  been  made  with  the  introduction 
of  "memo  functions"  into  the  Edinburgh  POP-2  language  (40).  With  memo 
functions,  however,  the  programmer  still  has  decisions  to  make,  and  I 
look  forward  to  a  further  "Samuelization"  of  programming  language  systems 
for  the  store  vs.  recompute  decision. 
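The store vs. recompute trade-off can be sketched in a few lines. This is an illustration in Python, not the POP-2 facility itself: the `memo` wrapper and the call counter are hypothetical names, and the programmer still decides by hand which function gets a rote memory, which is exactly the decision the text hopes a future system will make for itself.

```python
def memo(function):
    """Wrap a function with a rote memory of argument -> value pairs."""
    table = {}

    def wrapper(*args):
        if args not in table:          # not memorized yet: compute and store
            table[args] = function(*args)
        return table[args]             # memorized: plain table lookup

    wrapper.table = table              # exposed so the memory can be inspected
    return wrapper


calls = {"fib": 0}                     # counts real computations, not lookups


@memo
def fib(n):
    calls["fib"] += 1
    return n if n < 2 else fib(n - 1) + fib(n - 2)
```

With the rote memory in place, each argument value is computed once and every later request is a lookup; without it, the same recursive definition recomputes the same subproblems many times over.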

Simon  (63)  once  wrote  a  problem-solving  program  (Heuristic  Compiler) 
that  solved  problems  of  writing  simple  programs  in  IPL-V,  given  descriptions 
of how the programs were supposed to alter the state of the IPL-V machine.
The program was organized as a simplified version of GPS, and was successful,
but  it  has  not  been  followed  up  with  a  more  substantial  effort.  Ph.D. 
candidates, where are you?

Semantic Information Processing and Natural Language


This area of research is of great importance to the A.I. endeavor.
Its importance arises not so much from the presumed practical advantages
of being able to converse in natural language with a program that
"understands", but because the research raises and focuses certain key
issues that are quite general. Research of high quality has been done
in the past few years. A book (42) covering much of it will be available
shortly.  Minsky's  introduction  to  the  book  should  be  consulted  for  an 
extended  treatment  of  the  subject,  which  is  impossible  here.  Parts  of 
a  paper  by  Coles  (13)  also  give  a  good  treatment.  Nevertheless,  a  few 
comments  may  be  useful. 

The  research  grapples  in  various  ways  with  the  problem  of  the 
meaning of ordinary natural language utterances and the computer
understanding of the meaning of these utterances as evidenced by its subsequent
linguistic, problem-solving, or question-answering behavior. Meaning is
viewed not as something one "puts into" a program (for example, ordinary
dictionary entries help very little) but as an emergent from the interplay
of syntactic analyzers, models that link to real-world objects and
relations,  appropriate  data  structures  that  link  together  symbols  of  the 
internal-world,  a  logical  calculus  and  associated  discovery  processes  for 
solving  derivation-type  problems.  (This  list  is  not  necessarily  exhaustive.) 

Syntactic analysis, which has received massive attention from the
computational  linguists  (with  elegant  results),  is  not  enough  to  handle 
problems  of  meaning  and  understanding.*  Consider  the  following  example. 

*Overemphasis on syntax analysis at the expense of research on semantic
processes has hindered the development of the mechanical translation
area, I believe.

Bobrow's STUDENT (8) is a problem solver that accepts natural
language input (English sentences). High-school-level algebra word
problems  constitute  the  domain  of  discourse.  The  sentences  are  simplified 
and  parsed;  idioms  are  transformed.  Reference  to  the  real-world  is  made 
through  a  table  of  global  relations.  Typically  the  amount  of  global 
information made available to STUDENT has been quite small; hence,
STUDENT's understanding of the algebra word problems is largely "syntactic".
The  appropriate  simultaneous  algebraic  equations  are  set  up  and  solved 
for  the  answer. 

STUDENT  can  be  made  to  solve  the  following  problem:  "A  board  was 
sawed  into  two  pieces.  One  piece  was  two-thirds  as  long  as  the  whole 
board and was exceeded in length by the second piece by four feet. How
long was the board before it was cut?" STUDENT will show that in one
sense it understands this problem by issuing the correct solution, that
the board length was minus twelve feet. In the psychological experiments
done by Paige and Simon in the light of STUDENT (51), some subjects solved
the problem just as STUDENT solved it (with a focus on the syntax leading
to  the  correct  equation  system),  but  others  immediately  recognized  its 
physical  impossibility  and  refused  to  set  up  the  equations.  These  people 
were  the  model  builders,  who  attached  the  "givens"  to  the  appropriate 
model of the physical situation, immediately noticing the misfit.* They
were  exhibiting  another  level  of  "understanding  of  the  problem"  not 
available  to  STUDENT. 
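The arithmetic behind the "minus twelve feet" answer is short enough to work through. The sketch below is not STUDENT's method; the function names are hypothetical. It sets up the equations a purely syntactic reading yields (first piece = ratio times the board, second piece = first piece plus the excess, and the two pieces sum to the board) and then applies the kind of physical check a "model builder" would make.

```python
from fractions import Fraction


def solve_board(ratio, excess):
    """Solve the equations read off the sentences for the board length x.

    first = ratio * x, second = first + excess, first + second = x.
    Substituting gives 2 * ratio * x + excess = x,
    so x = excess / (1 - 2 * ratio).
    """
    return Fraction(excess) / (1 - 2 * Fraction(ratio))


def physically_plausible(length):
    """The check a model of the situation supplies: a board has positive length."""
    return length > 0


board = solve_board(Fraction(2, 3), 4)   # the problem as stated in the text
```

Here `board` comes out as -12, the formally correct solution, and `physically_plausible(board)` is false. A solver holding a model of the situation could reject the problem earlier still: a piece two-thirds as long as the board cannot be the shorter of the two pieces.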

*My own informal replications of the experiment at Stanford led me to
believe that the effect is independent of the brilliance, or lack of
it, of the subject, but a function of whether he is a "visualizer" or
a "symbolizer". STUDENT, of course, is a "symbolizer".

The notion of interpretation of natural language in terms of stored
models is central to much of the research on semantic information
processing.  Raphael's  Semantic  Information  Retrieval  question-answering 
system  (55)  uses  a  node-link  relational  model  (with  a  restricted  set  of 
relations) to organize its data universe. This model is grown to incorporate
new information about objects and their relationships that is
extracted  from  a  simple  analysis  of  declarative  sentences  typed  in  at 
the  console  by  the  user.  Other  programs  use  internal  representations 
of  two-dimensional  pictures  as  a  model.  Coles  (13),  extending  the  work 
of Kirsch et al. (33) on the Picture Language Machine, wrote a program
that uses a picture (input by the user at a CRT with a light pen) to
resolve syntactic ambiguities in English sentences about the picture,
and to answer questions about the picture.

Pushing  the  subject  of  models  and  data  structures  a  bit  further, 
consider  restructuring  a  traditional  dictionary  into  a  "semantic  graph" 
in  the  following  way.  Represent  entities  as  symbols  at  the  nodes,  and 
the  various  general  and  special  relations  between  entities  as  associative 
links  between  the  nodes.  An  entity's  node  is  the  start  of  a  subgraph 
which,  when  traced  out,  encompasses  the  various  "meanings"  of  the 
entity  in  terms  of  other  entities  and  relations  in  the  semantic  graph. 
Quillian (53) has constructed such a graph for a small (but definitely
nontrivial) set of dictionary definitions, and a processor for performing
a variety of associative and semantic reference tasks. Quillian's program,
I  believe,  is  a  good  foundation  for  further  research  on  models  of  human 
associative  memory.  This  possibility  is  worth  a  vigorous  pursuit. 
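The node-and-link structure described above can be sketched as follows. The class, the relation names, and the sample entries are hypothetical and do not reproduce Quillian's encoding; the sketch shows only how an entity's "meaning" can be read off by tracing the subgraph that starts at its node.

```python
from collections import defaultdict, deque


class SemanticGraph:
    """Entities are nodes; labeled associative links connect them."""

    def __init__(self):
        self.links = defaultdict(list)   # node -> [(relation, target), ...]

    def relate(self, source, relation, target):
        self.links[source].append((relation, target))

    def trace(self, start, depth):
        """Collect the (source, relation, target) links reachable from
        `start` by following at most `depth` links, breadth first."""
        seen = {start}
        frontier = deque([(start, 0)])
        triples = []
        while frontier:
            node, distance = frontier.popleft()
            if distance == depth:
                continue
            for relation, target in self.links.get(node, []):
                triples.append((node, relation, target))
                if target not in seen:
                    seen.add(target)
                    frontier.append((target, distance + 1))
        return triples


graph = SemanticGraph()
graph.relate("plant", "is-a", "living thing")
graph.relate("plant", "has-part", "leaf")
graph.relate("living thing", "needs", "food")
```

Tracing one link out from "plant" gives its immediate definition; tracing two links also pulls in what is known about "living thing", so the depth of the trace controls how much of the surrounding graph counts as the meaning.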


Simulation  of  Cognitive  Processes 


Space limitations here preclude a survey of the work in this interesting
territory at the intersection of computer science and psychology: the
formulation and validation of information processing theories of human
problem solving and learning processes using primarily, but not exclusively,
the techniques of heuristic programming and the methodology of computer
simulation. Fortunately, two thorough reviews (31, 1) have appeared
recently and are recommended. Nevertheless, I cannot resist giving my
own personal set of pointers into the literature.

Problem Solving: Newell (47, 46, 43).

Analysis  of  human  behavior  in  crypto-arithmetic  puzzle  solving 
tasks;  major  methodological  advances  in  analysis  of  human 
problem  solving  "think-aloud"  protocols  and  the  study  of  human 
eye  movements  during  problem  solving. 

Verbal learning and memory: Simon and Feigenbaum (66); Gregg and
Simon (28); Feigenbaum (20); Hintzman (30).

Further  results  with  Elementary  Perceiver  and  Memorizer  (EPAM) 
model;  reinterpretation  in  terms  of  theory  of  various  levels 
of  human  memory;  extensions  by  Hintzman  to  handle  additional 
phenomena. 

Concept learning and pattern induction: Hunt, Marin, and Stone (32)
report many experiments with the Concept Learning System (CLS).

Simon and Kotovsky (67); Simon and Sumner (68). Simple, elegant
program that handles sequence completion tasks from standard
intelligence tests and can also infer and extrapolate patterns in
melodies.

Affect,  emotion,  beliefs:  Tessler,  Enea,  and  Colby  (71);  Abelson 
and  Carroll  (2).  Models  of  human  belief  systems  and  their  use 
in  studying  neurotic  and  "normal"  behavior. 

Simon (64). Emotional and motivational concomitants to cognitive
processes.

Judgment: Kleinmuntz (34, 35).

Model of the clinician's judgmental process in constructing
personality profile from subject's answers in Minnesota Multiphasic
Personality Inventory; and studies of other clinicians'
tasks.

The Shape of the Field, 1968: By Examination in Depth of One Program


In  attempting  to  characterize  state-of-the-art  in  a  field,  a  careful 
look at one research effort is often revealing. As with an etching, the
lines of the work (lines of ideas, of methods and techniques, of technology
employed) are seen with greater clarity from close up.

I  have  chosen  for  the  close-up  the  research  work  with  which  I  am 
personally  involved.  It  is  not  only  more  appropriate  for  me  to  do  this 
than  to  sketch  another's  work,  but  also  the  sketch  carries  with  it  an 
assured  knowledge  of  detail.  Most  important,  the  research  lies  squarely 
in  what  I  consider  to  be  the  mainstream  of  the  A. I.  research  endeavor: 
problem  solving  using  the  heuristic  search  paradigm. 

Our  primary  goal  was  to  study  processes  of  hypothesis  formation  in 
a  complex  task  of  a  scientific  nature  involving  the  analysis  of  empirical 
data.  The  task  environment  chosen  as  the  medium  for  this  study  was  the 
analysis  of  the  mass  spectra  of  organic  molecules:  the  generation  of  a 
hypothesis  to  best  explain  given  mass  spectral  data.  This  is  a  relatively 
new area of organic chemistry of great interest to physical chemists.
In this sense, the problem is not a "toy" problem; and a program that
solves  problems  of  this  type  is  a  useful  application  of  A. I.  research 
to  a  problem  of  importance  to  science. 

We  have  written  a  program  to  infer  structural  hypotheses  from  mass 
spectral  data.  The  program  is  called  Heuristic  DENDRAL.  It  was  developed 
at  the  Stanford  University  Artificial  Intelligence  Project  by  a  small 
group including Professor Joshua Lederberg of the Stanford Genetics
Department, Dr. Bruce Buchanan, Mrs. Georgia Sutherland, and me, with
the assistance of chemists and mass spectrometrists of the Stanford
Chemistry Department. It is an 80,000 word program written in LISP for
the PDP-6 computer, and was developed (and is run) interactively under
the time-sharing monitor (39, 9, 22).

Heuristic  DENDRAL  will  perform  the  following  two  classes  of  tasks: 

1. Given the mass spectrum of an organic molecular sample and the
chemical formula of the molecule, the program will produce a
short list of molecular "graphs" as hypotheses to explain the
given data in the light of the program's models of mass spectrometric
processes and stability of organic molecules. The list
is  rank-ordered  from  the  most  satisfactory  explanation  to  the 
least  satisfactory. 

2.  If  no  mass  spectrum  is  given,  but  only  a  formula,  the  program 
will  produce  a  list  of  all  the  chemically  plausible  isomers  of 
the  molecule  in  the  light  of  its  model  of  chemical  stability  of 
organic  molecules. 

The  flow  diagram  of  the  system  is  a  closed  loop  consisting  of 
phases  of  data  inspection,  hypothesis  generation,  prediction,  and  test, 
corresponding  closely  to  a  simple  "scientific  method"  loop. 

At the heart of the program is a systematic hypothesis generator.
It is based on an algorithm developed by Lederberg called DENDRAL, which is
capable  of  generating  all  of  the  topologically  possible  isomers  of  a 
chemical  formula.  The  generator  is  essentially  a  topologist,  knowing 
nothing  about  chemistry  except  for  the  valences  of  atoms;  but  the 
generating  algorithm  serves  as  the  guarantor  of  the  completeness  of  the 
hypothesis  space,  in  a  fashion  analogous  to  the  legal  move  generator 
in a chess program. Since the generating process is a combinatorial
procedure, it produces for all but the simplest molecules a very large
set of structures, almost all of which are chemically implausible though
topologically possible. Implicit in its activity is a tree of possible
hypothesis candidates. At the top node of the tree all the atoms are
found but no structures. At the terminal nodes, only complete structures
are found, but no unallocated atoms. Each intermediate node specifies a
partially built structure and a residual set of atoms yet to be allocated.

This  tree  is  the  implicit  problem  space  for  Heuristic  DENDRAL. 
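The shape of this tree can be shown with a deliberately simplified sketch. It is not Lederberg's DENDRAL algorithm: as a toy assumption, a "structure" here is just an unbranched chain of atom symbols, with a chain and its reversal counted as the same structure, and valences are ignored. What the sketch keeps is the node layout described above: each node pairs a partial structure with the residual atoms still to be allocated.

```python
from collections import Counter


def expand(partial, residual):
    """Yield the child nodes: allocate one more atom from the residual pool."""
    for atom in sorted(residual):
        rest = residual.copy()
        rest[atom] -= 1
        if rest[atom] == 0:
            del rest[atom]               # keep the pool free of empty entries
        yield partial + (atom,), rest


def terminal_structures(atoms):
    """Walk the whole tree and return every distinct complete structure."""
    stack = [((), Counter(atoms))]       # top node: all atoms, no structure
    found = set()
    while stack:
        partial, residual = stack.pop()
        if not residual:                 # terminal node: nothing unallocated
            found.add(min(partial, partial[::-1]))
            continue
        stack.extend(expand(partial, residual))
    return sorted(found)
```

For the atoms "CCO" this yields the two chains C-C-O and C-O-C. Even in this toy form the tree grows combinatorially with the number of atoms, which is why the heuristic controls on path generation matter.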

Various  heuristic  rules  and  chemical  models  are  employed  to  control  the 
generation of paths through this space, as follows:

1.  A  model  of  the  chemical  stability  of  organic  molecules  based 
on  the  presence  of  certain  denied  and  preferred  subgraphs  of 
the  chemical  graph.  It  is  called  the  a  priori  model  since  it 
is  independent  of  processes  of  mass  spectrometry. 

2. A very crude but efficient theory of the behavior of molecules
in a mass spectrometer, called the Zero-order Theory of Mass
Spectrometry, used to make a rough initial discarding of
whole classes of structures because they are not valid in the
light of the data, even to a crude approximation.

3.  A  set  of  pattern  recognition  heuristic  rules  which  allow  a 
preliminary  interpretation  of  the  data  in  terms  of  the  presence 
of  key  functional  groups,  absence  of  other  functional  groups, 
weights of radicals attached to key functional groups, etc.
It is called the Preliminary Inference Maker. Its activity
allows the Hypothesis Generator to proceed directly to the most
plausible subtrees of the space.

The output of the Preliminary Inference and Hypothesis Generation
processes is a list of molecular structures that are candidate hypotheses
for an explanation of the mass spectrum. They are all chemically plausible
under the a priori theory and valid explanations of the data under our
zero-order theory of mass spectrometry. Typically the list contains a few
candidates  (but  not  dozens  or  hundreds). 

Next  a  confrontation  is  made  between  this  list  of  "most  likely" 
hypotheses and the data. For each candidate hypothesis, a detailed
prediction  is  made  of  its  mass  spectrum.  This  is  done  with  a  subprogram 
called  the  Predictor,  a  complex  theory  of  mass  spectrometry  in  computer 
simulation  form.  The  Predictor  is  not  a  heuristic  program.  It  is  an 
elaborate  but  straightforward  procedure  for  deducing  consequences  of  a 
theory  of  mass  spectrometry  extracted  by  us  from  chemists  and  their 
literature.  The  spectral  prediction  for  each  candidate  is  matched 
with  the  empirical  input  data  by  a  process  called  the  Evaluation  Function. 
This  is  a  heuristic,  hierarchical,  non-linear  scoring  procedure.  Some 
hypothesis  candidates  are  immediately  discarded  because  their  predicted 
spectra  fail  certain  critical  confrontations.  The  remainder  are  scored, 
ranked,  and  printed  out  in  rank  order  from  most  to  least  satisfactory. 
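The predict, confront, and rank cycle just described can be outlined in code. Everything specific here is a labeled assumption: the candidate names, the peak masses, and the scoring rule (matched peaks minus mismatches) are invented for illustration, and the real Evaluation Function is described as hierarchical and non-linear, which this flat score is not.

```python
def rank_hypotheses(candidates, predict, observed):
    """Confront each candidate's predicted spectrum with the observed data.

    `predict` is a stand-in for the Predictor: it returns a pair
    (predicted peaks, critical peaks), both as sets of masses.
    """
    scored = []
    for candidate in candidates:
        peaks, critical = predict(candidate)
        if not critical <= observed:
            continue                     # fails a critical confrontation
        hits = len(peaks & observed)
        misses = len(peaks - observed) + len(observed - peaks)
        scored.append((hits - misses, candidate))
    scored.sort(key=lambda pair: -pair[0])   # most satisfactory first
    return [candidate for _, candidate in scored]


# Invented toy data: these are not real spectra.
OBSERVED = {29, 43, 57, 72}
PREDICTIONS = {
    "A": ({29, 43, 72}, {72}),
    "B": ({29, 43, 57, 72}, {57}),
    "C": ({15, 44, 72}, {44}),
}
ranking = rank_hypotheses(["A", "B", "C"], PREDICTIONS.get, OBSERVED)
```

In this toy run candidate C is discarded outright because its critical peak is absent from the data, and B outranks A because it explains every observed peak.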

For  the  class  of  non-ringed  organic  structures  with  which  we  have 
been  working  up  to  the  present  time,  the  program's  behavior  approaches 
or  exceeds  the  performance  of  post-doctoral  laboratory  workers  in  mass 
spectrometry  for  certain  classes  of  organic  molecules.  These  include 
amino acids, with which for tangential reasons we have done much of
our  work,  and  a  large  variety  of  simple  organic  groups  that,  however, 
turn  out  to  be  considerably  more  complicated  than  amino  acids  from  the 
point of view of mass spectrometry.

Heuristic programming provided only the skeleton for the problem
solving  processes  of  Heuristic  DENDRAL  and  the  computer  techniques  to 
handle  the  implementation.  The  heuristics  of  chemical  plausibility  of 
structures;  of  preliminary  inference;  of  evaluation  of  the  predictions; 
and  also  the  zero-order  and  complex  theories  of  mass  spectrometry--these 
were  all  extracted  from  our  chemist  colleagues  by  man-machine  interaction, 
with the process carefully guided by one of our research team. This
mixed discipline for pulling out of the heads of practicing professionals
the problem solving heuristics they are using has worked far
better than we had any right to expect, and we are now considering further
mechanization of this process.

The  Problem  of  Representation  for  Problem  Solving  Systems 

A. I.  research  in  the  remainder  of  the  second  decade  will  be  dominated 
by  a  few  key  problems  of  general  importance.  The  problem  of  representation 
for  problem  solving  systems  is  one  of  these,  and  in  my  view  the  most 
important,  though  not  the  most  immediately  tractable.* 

In  heuristic  problem  solving  programs,  the  search  for  solutions 
within a problem space is conducted and controlled by heuristic rules.
The representation that defines the problem space is the problem solver's
"way of looking at" the problem and also specifies the form of solutions.

*I have used the term "problem of representation for problem solving
systems"  to  distinguish  this  problem  from  the  much  more  widely 
discussed  data  representation  (and  data  structures)  problem.  I 
believe  that  we  will  find  eventually  that  the  two  sets  of  questions 
have  an  important  intersection,  but  for  the  moment  it  is  best  to 
avoid terminological confusion.

Choosing a representation that is right for a problem can improve
spectacularly  the  efficiency  of  the  solution-finding  process.  The 
choice of problem representation is the job of the human programmer
and is a creative act. Amarel (5) believes that the process of choosing
and shaping appropriate representations for problem solving is the
essence of the behavior in humans that we call "creative". I agree.

Some  examples  of  the  impact  of  choice  of  representation  on  problem 
solving  performance  have  been  discussed  in  the  literature.  The  classic 
is the so-called "tough nut" proposed by McCarthy and discussed by
Newell (45). Mutilate a chess board by removing two corner squares diagonally
opposed; can the mutilated board be covered by dominoes? If the standard
piece-board-move game playing representation is employed, an enormous and
almost  impossible  search  would  have  to  be  conducted  to  discover  that  no 
covering  solution  was  possible.  But  a  choice  of  problem  representation 
involving  the  concepts  of  parity  of  red-black  covering  by  a  domino  and 
of  counting  of  red  and  black  squares  leads  immediately  to  the  solution 
that  no  covering  is  possible  because  two  squares  of  the  same  color  are 
removed  in  the  mutilation. 
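The parity argument can be written out directly. In the sketch below the function names and the coordinate convention (row and column from 0, a square colored "red" when row plus column is even) are assumptions made for illustration. Each domino covers one red and one black square, so equal counts are a necessary condition for a covering; the check can prove impossibility but cannot by itself prove that a covering exists.

```python
def color_counts(removed, size=8):
    """Count the remaining red and black squares on a mutilated board."""
    counts = {"red": 0, "black": 0}
    for row in range(size):
        for column in range(size):
            if (row, column) in removed:
                continue
            color = "red" if (row + column) % 2 == 0 else "black"
            counts[color] += 1
    return counts


def parity_allows_covering(removed, size=8):
    """Necessary condition only: a domino always covers one square of each
    color, so a covering needs equally many red and black squares."""
    counts = color_counts(removed, size)
    return counts["red"] == counts["black"]


OPPOSITE_CORNERS = {(0, 0), (7, 7)}      # the mutilation in the puzzle
```

For the opposite corners the counts come out 30 red and 32 black, so no covering is possible, and the answer is reached by examining 62 squares once instead of searching the space of domino placements.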

Another  example  of  much  greater  complexity  has  been  worked  out  by 
Amarel  for  the  traditional  puzzle  of  transporting  missionaries  and 
cannibals from one bank of a river to the other with a boat under certain
constraints (4). Amarel exhibits a succession of representational shifts
for  this  problem,  from  the  one  usually  used  to  a  simple  but  elegant 
matrix-like  representation,  in  terms  of  which  the  solution  is  available 
almost  immediately  by  inspection.  Amarel  has  worked  out  still  another 
example for theorem proving in the propositional calculus (4a). In fact,
as early as 1958, Gelernter in the Geometry Machine used a diagram as an
auxiliary  problem  representation  to  improve  the  efficiency  of  searching 
the  problem-subproblem  tree. 

Until  very  recently  the  problem  of  representation  has  been  treated  in 
the  literature  by  exploring  a  few  examples  in  detail.  Fortunately,  a  new 
paper  by  Amarel  (6),  offering  a  synthesis  of  his  view  of  problem  solving  and 
representation,  gives  a  clear  formulation  and  extended  discussion  of  this 
difficult  area. 

Why  is  it  that  a  shift  of  problem  representation  can  lead  to  a 
spectacular  change  in  problem  solving  effectiveness?  There  are  many 
reasons;  here  are  a  few.  Each  problem  representation  has  associated 
with  it  a  set  of  specialized  methods  for  manipulating  elements  of  the 
representation.  Shifting  to  a  representation  that  is  "rich"  in  specialized 
theory  and  methods  from  one  that  is  impoverished  in  this  regard  allows  the 
power  of  the  former  to  be  applied  to  the  problem  at  hand.  Similarly, 
specialized  relationships  associated  with  an  appropriate  representation 
can be imported into the (often incomplete) statement of a problem,
thereby supplying missing but crucial augmentations to the problem
definition. An example of this has been exhibited by Paige and Simon
(51) for alcohol-water mixture problems (the appropriate representation
supplies  necessary  conservation  equations).  Finally,  each  representation 
can  have  associated  with  it  a  data  base  of  descriptions  and  facts  that 
become  available  for  incorporation  into  the  problem  statement  or  for 
use  in  controlling  search. 

Amarel has discussed the mechanization of the process of shift of
representation as a step-by-step process, involving an "evolution" of
each representation to a somewhat more powerful one (4, 4a). The
evolution  is  guided  by  information  about  the  problem  (or  problem  class) 
that  turns  up  during  the  problem  solving  activity  within  a  particular 
representation  of  the  moment. 

The  alternative  to  this  step-by-step  process  is  a  generator  of 
new  representations  as  trial  candidates  in  a  heuristic  search  for  an 
appropriate  representation.  Design  of  such  a  generator  is  a  formidable 
task  at  this  early  stage  in  our  understanding  of  the  representation 
problem.  The  simplest  design,  however,  is  to  generate  the  elements  of  a 
stored  repertoire  of  previously  encountered  or  potentially  useful 
representations.  Such  a  design  was  employed  in  a  program  by  Persson  (52) 
for  the  problem  of  choosing  the  appropriate  representation  of  pattern 
in  a  mixture  of  different  sequence  extrapolation  tasks. 

In  my  view,  the  use  of  the  concept  of  analogy  between  problems  is  a 
crucial  step  in  the  design  of  a  generator  of  representations.  Candidates 
for  an  appropriate  problem  representation  are  searched  for,  discovered, 
and  tried,  by  a  search  process  that  uses  analogical  reasoning  over  a 
store  of  known  representations  (and  their  associated  methods,  data  bases, 
etc.).  Problem  solving  search  using  reasoning-by-analogy  has  received 
surprisingly  little  attention  in  A.  I.  research,  considering  the  importance 
of  the  problem.  The  work  by  Evans  (19)  on  a  program  to  solve  "intelligence 
test"  problems  involving  geometrical  analogies  is  the  only  work  I  can  cite. 

Of  necessity,  these  comments  on  the  problem  of  representation  have 
been  sketchy  in  the  extreme.  But  because  of  the  central  importance  of 
this  problem,  I  felt  the  need  to  focus  attention  on  it.  I  believe  that 
the long gestation period for this problem is ending; that the time is
ripe for a major push; that there will be important developments on this
front in the next five years; and that these will come to be viewed as
having the same degree of centrality and importance that now attaches
itself to the heuristic search paradigm.

Centers of Excellence

It is conventional to survey research by topic and unconventional
to survey it by places and people. Yet I am frequently asked in
conversation some form of the question: "In artificial intelligence
(particularly heuristic programming), where is the action?" There is no
reason  for  not  attempting  to  answer  this  question  in  a  public  forum. 

The  reference  point  in  time  is  mid-1968.  The  emphasis  is  on  a 
substantial quantity of high quality research (my assessment).

In  the  United  States,  the  three  major  research  centers  are  the  A. I. 
projects  at  MIT,  Carnegie-Mellon  University  (nee  Carnegie  Tech),  and 
Stanford  University.  All  three  receive  the  major  portion  of  their 
support from the Advanced Research Projects Agency (Department of
Defense).  All  three  train  substantial  numbers  of  Ph.D.  students  in  the 
A. I.  area.  The  MIT  and  Stanford  projects  use  dedicated  PDP-6  computers 
with  big  core  memories;  the  Carnegie  project  uses  an  IBM  360/67  with  a 
very  large  extended  core  memory. 

At  MIT,  the  more  senior  faculty  and  research  principals  are  Minsky,
Papert,  and  to  some  extent,  Weizenbaum  (73);  at  Carnegie,  Newell  and
Simon;  at  Stanford,  McCarthy,  Samuel,  Colby,  and  Feigenbaum.  The  Stanford
group  has  close  ties  with  the  neighboring  SRI  group  (Nilsson  and  Raphael);
similarly  the  MIT  group  has  close  ties  with  Bobrow's  group  at  Bolt,  Beranek,
and  Newman.

Citing  some  statistics  from  the  Stanford  University  group,  which
I  obviously  know  best,  there  are  75  people  (faculty,  students,  and  staff) 
associated  with  the  A. I.  project;  about  25  different  research  projects 
underway;  and  a  working  paper  series  of  67  papers.  I  offer  these  figures 
to  indicate  scale  of  effort  at  a  major  center. 

Five  other  centers  deserving  attention  are:  Case  Western  Reserve 
University  (Banerji,  Ernst);  University  of  Wisconsin  (Travis,  Uhr  and 
London);  RCA  Laboratories  at  Princeton  (Amarel);  Heuristics  Laboratory, 
National  Institutes  of  Health  (Slagle);  and  the  University  of  Washington 
(Hunt). 

In  Europe,  no  centers  comparable  to  the  major  American  centers  were 
visible  in  the  first  decade.  In  the  past  few  years,  however,  a  center 
of  the  first  rank  has  arisen  at  the  University  of  Edinburgh,  and  other 
centers  are  emerging  in  Sweden  and  the  Soviet  Union.

At  Edinburgh,  A. I.  research  is  enshrined  in  a  Department  of  Machine 
Intelligence  and  Perception  (how  forthrightly  can  one  state  one’s  case?). 
The  principals  are  Michie,  Gregory,  Burstall,  Doran,  and  Popplestone.
They  are  supported  reasonably  generously  by  the  British  Government.
The  research  ranges  very  broadly  from  the  various  projects  in  information
processing  models  of  cognition  and  perception  to  applications  of  these 
models  (10,  11)  and  development  of  programming  languages  (12).  The
latest  "score  sheet"  from  the  Department  gives  bibliographic  data  for 
59  "original  research  contributions"  since  1965!  This  group  is  responsible
for  a  major  series  of  collected  papers,  the  Machine  Intelligence  series. 

At  Uppsala  University  in  Sweden,  the  Department  of  Computer  Sciences
is  doing  A. I.  research  (GPS,  planning,  robot  simulations,  LISP  work).
Principal  activist  is  Sandewall,  formerly  at  Stanford.  Psychologists
interested  in  simulation  of  cognitive  processes  are  participating.  The 
Swedish  Natural  Science  Research  Council  supports  the  research. 

When  I  cast  my  mind's  eye  as  far  off  as  the  Soviet  Union,  the 
image  becomes  fuzzy,  though  I  make  a  determined  effort  to  keep  current 
with  Soviet  computer  science  literature  (particularly  the  area  of 
discussion,  heuristic  programming).  The  few  papers  I  have  seen  are 
motivation-suppliers  or  clarifications  at  points  of  contact  with 
philosophy,  psychology,  and  neurophysiology.  However,  there  are 
talented  groups  at  various  locations  that  are  interested  (and  perhaps 
actively  working)  in  the  area.  These  are: 

Institute  of  Cybernetics,  Ukrainian  Academy  of  Sciences,  Kiev 
(Glushkov,  Amosov,  and  coworkers)

Institute  of  Automation  and  Remote  Control,  Moscow 
(Aizerman's  Laboratory) 

Moscow  State  University,  Department  of  Higher  Nervous  Activity 
(Napalkov) 

Computer  Center,  Siberian  Division,  Academy  of  Sciences  of  USSR, 
Novosibirsk  (Yershov,  Marchuk  and  coworkers) 

All  of  these  are  major  centers,  interested  generally  in  the  problems 
of  A. I.  research.  Whether  the  developers  of  the  chess  program  mentioned 
earlier,  at  the  Institute  of  Theoretical  and  Applied  Physics  in  Moscow, 
are  interested  in  problems  other  than  chess  program  development  I  do 
not  know. 

A  Russian  translation  of  Computers  and  Thought,  edited  by  Napalkov 
and  Orfyeev,  appeared  last  year  and  was  an  immediate  sell-out.  A
Scientific-Technical  Commission  on  Heuristic  Programming  has  come  into
existence  within  the  last  two  years. 
A  computer  scientist,  writing  in  Izvestiya,  claims  "solid  successes
of  Soviet  heuristic  programming"  in  pattern  recognition,  chess,  and 
theorem  proving,  but  cites  lags  in  "breadth  of  the  work  being  done"  and 
"in  equipping  these  projects  with  computer  hardware".  (38)  It  would  be 
useful  for  the  A. I.  field  to  have  a  survey  paper  by  a  Soviet  computer 
scientist  on  Soviet  work  in  heuristic  programming. 